exponnorm (Exponentially Modified Normal / exGaussian)#
The exponnorm distribution (SciPy’s name) is the distribution of a normal random variable plus an independent exponential delay.
It’s a simple, interpretable model for right-skewed continuous data: symmetric measurement noise (Normal) + one-sided waiting time (Exponential).
What you’ll learn#
what exponnorm models and when it's appropriate
the PDF/CDF in closed form (and how SciPy parameterizes it)
mean/variance/skewness/kurtosis, plus MGF/characteristic function and entropy notes
how parameters change the shape (skew and tail behavior)
key derivations: expectation, variance, likelihood
NumPy-only sampling via the “Normal + Exponential” construction
SciPy usage: scipy.stats.exponnorm (pdf, cdf, rvs, fit)
statistical use cases: hypothesis testing, Bayesian modeling patterns, generative modeling
Notebook roadmap#
Title & classification
Intuition & motivation
Formal definition (PDF/CDF)
Moments & properties
Parameter interpretation
Derivations (\(\mathbb{E}[X]\), \(\mathrm{Var}(X)\), likelihood)
Sampling & simulation (NumPy-only)
Visualization (PDF, CDF, Monte Carlo)
SciPy integration (scipy.stats.exponnorm)
Statistical use cases
Pitfalls
Summary
import math
import numpy as np
import scipy
from scipy import special, stats
from scipy.optimize import brentq, minimize
import plotly
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio
from plotly.subplots import make_subplots
pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")
SEED = 7
rng = np.random.default_rng(SEED)
np.set_printoptions(precision=6, suppress=True)
print("numpy ", np.__version__)
print("scipy ", scipy.__version__)
print("plotly", plotly.__version__)
numpy 1.26.2
scipy 1.15.0
plotly 6.5.2
Prerequisites & notation#
Prerequisites
basic probability (PDF/CDF, independence, expectation)
comfort with calculus (a convolution integral)
Notation
standard normal PDF/CDF: \(\phi(\cdot)\), \(\Phi(\cdot)\)
complementary error function: \(\mathrm{erfc}(\cdot)\)
Parameterizations
A common “exGaussian” construction is:
\[ X = N + D, \quad N \sim \mathcal{N}(\mu,\sigma^2), \quad D \sim \mathrm{Exp}(\text{rate}=\lambda), \quad N \perp D. \]
SciPy's exponnorm uses a dimensionless shape parameter \(K\) instead of \(\lambda\):
stats.exponnorm(K, loc=μ, scale=σ)
mapping: \(K = \frac{1}{\sigma\lambda}\) (so \(\lambda = \frac{1}{K\sigma}\))
exponential mean (delay scale): \(\tau = 1/\lambda = K\sigma\)
In the standardized form (loc=0, scale=1): \(Y = Z + E\) with \(Z \sim \mathcal{N}(0,1)\) and \(E \sim \mathrm{Exp}(\text{rate}=1/K)\).
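As a quick sanity check on this mapping (a minimal sketch; the parameter values are arbitrary), we can confirm numerically that stats.exponnorm(K, loc=μ, scale=σ) with \(K = 1/(\sigma\lambda)\) reproduces the exGaussian mean \(\mu + 1/\lambda\) and variance \(\sigma^2 + 1/\lambda^2\):

# Sanity check: (mu, sigma, lambda) -> (K, loc, scale) mapping (arbitrary values)
mu, sigma, lam = 2.0, 0.5, 1.25
K = 1.0 / (sigma * lam)  # SciPy shape parameter
rv = stats.exponnorm(K, loc=mu, scale=sigma)
mean, var = (float(v) for v in rv.stats(moments="mv"))
assert np.isclose(mean, mu + 1.0 / lam)          # E[X] = mu + 1/lambda
assert np.isclose(var, sigma**2 + 1.0 / lam**2)  # Var(X) = sigma^2 + 1/lambda^2
print(mean, var)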
1) Title & Classification#
Name: exponnorm (Exponentially modified normal / exGaussian)
Type: continuous
Support: \(x \in \mathbb{R}\)
Parameter space (SciPy):
\(K > 0\) (shape)
\(\text{loc} \in \mathbb{R}\) (location)
\(\text{scale} > 0\) (scale)
A useful generative view (SciPy parameterization):
\[ X = \text{loc} + \text{scale}\,(Z + E), \quad Z \sim \mathcal{N}(0,1), \quad E \sim \mathrm{Exp}(\text{rate}=1/K), \quad Z \perp E. \]
Equivalently, \(X = N + D\) with \(N \sim \mathcal{N}(\text{loc},\,\text{scale}^2)\) and \(D \sim \mathrm{Exp}(\text{rate}=1/(K\,\text{scale}))\).
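The two representations describe the same distribution. A quick Monte Carlo check (a sketch; the parameter values are arbitrary) builds samples as loc + scale*(Z + E) and compares them to SciPy's CDF with a Kolmogorov–Smirnov test:

# Sanity check: the generative construction matches stats.exponnorm
K, loc, scale = 1.4, 0.5, 0.7
z = rng.normal(size=50_000)
e = rng.exponential(scale=K, size=50_000)  # Exp(rate=1/K) has mean K
x = loc + scale * (z + e)
ks = stats.kstest(x, stats.exponnorm(K, loc=loc, scale=scale).cdf)
print(ks.statistic, ks.pvalue)  # small statistic, non-tiny p-value expected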
2) Intuition & Motivation#
What it models#
exponnorm is a convolution of a normal and an exponential distribution.
Think of an observation as:
a symmetric baseline measurement \(N\) (sensor noise, natural variability), plus
a one-sided random delay \(D \ge 0\) (queueing, reaction-time delay, processing latency).
The exponential part creates a long right tail and positive skew, while the normal part keeps the distribution “normal-like” near its center.
Typical real-world use cases#
Reaction times (psychology / neuroscience): decision time + motor delay, often right-skewed.
Network / system latency: baseline jitter + occasional queueing delays.
Service times in queues: variability around a typical duration plus delay bursts.
Chromatography / mass spectrometry: peak shapes with exponential tailing.
Relations to other distributions#
\(K \to 0\) (or \(\lambda \to \infty\)): the exponential delay vanishes and the distribution approaches a normal.
\(\sigma \to 0\) (keeping \(\tau\) fixed): the normal collapses and you approach a shifted exponential.
Compared to lognormal or gamma, exponnorm has an explicit "noise + delay" interpretation and a very simple sampling story.
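Both limits are easy to verify numerically. A small sketch (the parameter values are illustrative) measures the maximum CDF gap against each limiting distribution:

# K -> 0: exponnorm approaches the standard normal
xs = np.linspace(-5, 5, 1001)
for K in [0.5, 0.1, 0.01]:
    gap = float(np.max(np.abs(stats.exponnorm(K).cdf(xs) - stats.norm.cdf(xs))))
    print(f"K={K:g}: max CDF gap vs normal = {gap:.4f}")

# sigma -> 0 with tau = K*sigma fixed at 1: approaches Exp(mean=1)
xs = np.linspace(0.0, 8.0, 1001)
for sigma in [0.5, 0.1, 0.01]:
    K = 1.0 / sigma  # keeps tau = K*sigma = 1
    gap = float(np.max(np.abs(stats.exponnorm(K, scale=sigma).cdf(xs) - stats.expon.cdf(xs))))
    print(f"sigma={sigma:g}: max CDF gap vs exponential = {gap:.4f}")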
3) Formal Definition#
Hierarchical (convolution) definition#
Let \(Z \sim \mathcal{N}(0,1)\) and \(E \sim \mathrm{Exp}(\text{rate}=1/K)\) be independent with \(K>0\). Define the standardized variable \(Y = Z + E\). Then:
\[ Y \sim \mathrm{exponnorm}(K). \]
The general (shifted/scaled) variable is:
\[ X = \text{loc} + \text{scale}\,Y. \]
PDF#
For the standardized form (\(\text{loc}=0\), \(\text{scale}=1\)), SciPy uses:
\[ f_Y(y;K) = \frac{1}{2K}\exp\!\left(\frac{1}{2K^2}-\frac{y}{K}\right)\, \mathrm{erfc}\!\left(\frac{1/K - y}{\sqrt{2}}\right), \qquad y\in\mathbb{R},\; K>0. \]
The shifted/scaled PDF is obtained by the change of variables \(y=(x-\text{loc})/\text{scale}\):
\[ f_X(x;K,\text{loc},\text{scale}) = \frac{1}{\text{scale}}\, f_Y\!\left(\frac{x-\text{loc}}{\text{scale}};K\right). \]
CDF#
A convenient closed form in terms of the standard normal CDF \(\Phi\) is:
\[ F_Y(y;K) = \Phi(y) - \exp\!\left(\frac{1}{2K^2}-\frac{y}{K}\right)\, \Phi\!\left(y-\frac{1}{K}\right). \]
Again, \(F_X(x) = F_Y\!\big((x-\text{loc})/\text{scale}\big)\).
We’ll implement these formulas (carefully, in log-space where helpful) and check them against SciPy.
def exgauss_to_scipy(mu: float, sigma: float, lam: float) -> tuple[float, float, float]:
"""Map (μ, σ, λ) exGaussian parameters to SciPy's (K, loc, scale)."""
if sigma <= 0:
raise ValueError("sigma must be > 0")
if lam <= 0:
raise ValueError("lam must be > 0")
K = 1.0 / (sigma * lam)
return float(K), float(mu), float(sigma)
def scipy_to_exgauss(K: float, loc: float, scale: float) -> tuple[float, float, float, float]:
"""Map SciPy's (K, loc, scale) to (μ, σ, λ, τ) where τ=1/λ is the exp mean."""
if K <= 0:
raise ValueError("K must be > 0")
if scale <= 0:
raise ValueError("scale must be > 0")
lam = 1.0 / (K * scale)
tau = 1.0 / lam
return float(loc), float(scale), float(lam), float(tau)
def exponnorm_logpdf_standard(y: np.ndarray, K: float) -> np.ndarray:
"""Log-PDF of the standardized exponnorm (loc=0, scale=1).
Formula (SciPy):
f(y) = 1/(2K) * exp(1/(2K^2) - y/K) * erfc((1/K - y)/sqrt(2))
We evaluate it stably via erfcx(z) = exp(z^2) * erfc(z).
"""
y = np.asarray(y, dtype=float)
if K <= 0:
raise ValueError("K must be > 0")
z = (1.0 / K - y) / np.sqrt(2.0)
return (
-np.log(2.0 * K)
+ (1.0 / (2.0 * K * K))
- y / K
- z * z
+ np.log(special.erfcx(z))
)
def exponnorm_logpdf(x: np.ndarray, K: float, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
"""Log-PDF of exponnorm(K, loc, scale) using the standardized log-PDF."""
x = np.asarray(x, dtype=float)
if scale <= 0:
raise ValueError("scale must be > 0")
y = (x - loc) / scale
return exponnorm_logpdf_standard(y, K) - np.log(scale)
def exponnorm_pdf(x: np.ndarray, K: float, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
return np.exp(exponnorm_logpdf(x, K, loc=loc, scale=scale))
def exponnorm_cdf_standard(y: np.ndarray, K: float) -> np.ndarray:
"""CDF of the standardized exponnorm.
Closed form:
F(y) = Φ(y) - exp(1/(2K^2) - y/K) * Φ(y - 1/K)
We evaluate the second term using logcdf for stability.
"""
y = np.asarray(y, dtype=float)
if K <= 0:
raise ValueError("K must be > 0")
term1 = stats.norm.cdf(y)
log_term2 = (1.0 / (2.0 * K * K)) - y / K + stats.norm.logcdf(y - 1.0 / K)
term2 = np.exp(log_term2)
return term1 - term2
def exponnorm_cdf(x: np.ndarray, K: float, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
x = np.asarray(x, dtype=float)
if scale <= 0:
raise ValueError("scale must be > 0")
y = (x - loc) / scale
return exponnorm_cdf_standard(y, K)
def exponnorm_stats_closed_form(
K: float,
loc: float = 0.0,
scale: float = 1.0,
) -> tuple[float, float, float, float]:
"""Return (mean, var, skew, excess_kurtosis)."""
if K <= 0:
raise ValueError("K must be > 0")
if scale <= 0:
raise ValueError("scale must be > 0")
mean = loc + scale * K
var = (scale * scale) * (1.0 + K * K)
skew = 2.0 * (K**3) / ((1.0 + K * K) ** 1.5)
excess_kurt = 6.0 * (K**4) / ((1.0 + K * K) ** 2.0)
return float(mean), float(var), float(skew), float(excess_kurt)
def exponnorm_mgf(t: np.ndarray, K: float, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
"""MGF M_X(t) = E[exp(tX)]. Exists only for t < 1/(K*scale)."""
t = np.asarray(t, dtype=float)
if K <= 0:
raise ValueError("K must be > 0")
if scale <= 0:
raise ValueError("scale must be > 0")
denom = 1.0 - K * scale * t
if np.any(denom <= 0):
raise ValueError("MGF exists only for t < 1/(K*scale).")
return np.exp(loc * t + 0.5 * (scale * t) ** 2) / denom
def exponnorm_cf(t: np.ndarray, K: float, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
"""Characteristic function φ_X(t) = E[exp(i t X)]."""
t = np.asarray(t, dtype=float)
if K <= 0:
raise ValueError("K must be > 0")
if scale <= 0:
raise ValueError("scale must be > 0")
denom = 1.0 - 1j * K * scale * t
return np.exp(1j * loc * t - 0.5 * (scale * t) ** 2) / denom
def sample_exponnorm_numpy(
n: int,
K: float,
rng: np.random.Generator,
loc: float = 0.0,
scale: float = 1.0,
) -> np.ndarray:
"""NumPy-only sampling via X = loc + scale * (Z + E).
Z ~ N(0,1)
E ~ Exp(rate=1/K) (mean K)
"""
if n <= 0:
raise ValueError("n must be positive")
if K <= 0:
raise ValueError("K must be > 0")
if scale <= 0:
raise ValueError("scale must be > 0")
z = rng.normal(size=n)
e = rng.exponential(scale=K, size=n)
return loc + scale * (z + e)
def sample_moments(x: np.ndarray) -> tuple[float, float, float, float]:
"""Return (mean, var, skew, excess_kurtosis) using population moments (ddof=0)."""
x = np.asarray(x, dtype=float)
m = float(np.mean(x))
v = float(np.var(x, ddof=0))
s = float(np.sqrt(v))
if s == 0:
return m, v, np.nan, np.nan
z = (x - m) / s
skew = float(np.mean(z**3))
excess_kurt = float(np.mean(z**4) - 3.0)
return m, v, skew, excess_kurt
def K_from_skewness(gamma1: float, eps: float = 1e-10) -> float:
"""Invert gamma1 = 2 K^3 / (1 + K^2)^(3/2) for K>0.
Theoretical range: gamma1 ∈ [0, 2). We clip slightly below 2 for stability.
"""
gamma1 = float(np.clip(gamma1, 0.0, 2.0 - 1e-12))
if gamma1 < eps:
return eps
def f(K: float) -> float:
return 2.0 * (K**3) / ((1.0 + K * K) ** 1.5) - gamma1
return float(brentq(f, eps, 1e6))
def exponnorm_mom_initial_guess(x: np.ndarray) -> tuple[float, float, float]:
"""Method-of-moments initial guess (K, loc, scale) from mean/var/skew."""
m, v, skew, _ = sample_moments(x)
K0 = K_from_skewness(skew if np.isfinite(skew) else 0.0)
scale0 = float(np.sqrt(v / (1.0 + K0 * K0)))
loc0 = float(m - scale0 * K0)
return float(K0), float(loc0), float(scale0)
def x_grid_for_plotting(rv, q_low: float = 1e-3, q_high: float = 0.999, n: int = 600) -> np.ndarray:
"""Choose a reasonable finite x-grid for plotting a continuous distribution."""
lo = float(rv.ppf(q_low))
hi = float(rv.ppf(q_high))
if not (np.isfinite(lo) and np.isfinite(hi) and hi > lo):
mean, var = rv.stats(moments="mv")
mean = float(mean)
var = float(var)
if np.isfinite(mean) and np.isfinite(var) and var > 0:
std = float(np.sqrt(var))
lo = mean - 6 * std
hi = mean + 10 * std # skewed right: give more room on the right
else:
lo, hi = -5.0, 15.0
return np.linspace(lo, hi, n)
def empirical_cdf(samples: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
samples = np.asarray(samples, dtype=float)
xs = np.sort(samples)
n = xs.size
ys = np.arange(1, n + 1) / n
return xs, ys
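Before checking correctness, it's worth seeing why the log-space/erfcx route matters. Evaluating the textbook PDF formula directly breaks down in the left tail, where the exponential factor grows while erfc underflows (the test points below are arbitrary):

# Naive formula: finite value, then underflow to 0, then inf * 0 = nan
K = 0.5
y = np.array([-20.0, -50.0, -400.0])
with np.errstate(over="ignore", invalid="ignore"):
    naive = (
        (1.0 / (2.0 * K))
        * np.exp(1.0 / (2.0 * K * K) - y / K)
        * special.erfc((1.0 / K - y) / np.sqrt(2.0))
    )
print("naive pdf :", naive)
print("log-space :", exponnorm_logpdf_standard(y, K))  # finite everywhere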
# Quick correctness check vs SciPy (PDF/CDF + moments)
K, loc, scale = 1.5, -0.2, 0.8
rv = stats.exponnorm(K, loc=loc, scale=scale)
x = x_grid_for_plotting(rv, q_low=1e-4, q_high=0.9999, n=500)
pdf_err = float(np.max(np.abs(exponnorm_pdf(x, K, loc=loc, scale=scale) - rv.pdf(x))))
cdf_err = float(np.max(np.abs(exponnorm_cdf(x, K, loc=loc, scale=scale) - rv.cdf(x))))
mean_cf, var_cf, skew_cf, kurt_cf = exponnorm_stats_closed_form(K, loc=loc, scale=scale)
mean_sp, var_sp, skew_sp, kurt_sp = rv.stats(moments="mvsk")
{
"max|pdf - SciPy|": pdf_err,
"max|cdf - SciPy|": cdf_err,
"mean (closed, SciPy)": (mean_cf, float(mean_sp)),
"var (closed, SciPy)": (var_cf, float(var_sp)),
"skew (closed, SciPy)": (skew_cf, float(skew_sp)),
"kurt_excess (closed, SciPy)": (kurt_cf, float(kurt_sp)),
}
{'max|pdf - SciPy|': 1.942890293094024e-16,
'max|cdf - SciPy|': 2.220446049250313e-16,
'mean (closed, SciPy)': (1.0000000000000002, 1.0000000000000002),
'var (closed, SciPy)': (2.0800000000000005, 2.08),
'skew (closed, SciPy)': (1.1520696383139373, 1.1520696383139375),
'kurt_excess (closed, SciPy)': (2.8757396449704142, 2.8757396449704142)}
4) Moments & Properties#
Mean, variance, skewness, kurtosis#
Using the additive construction \(X = N + D\) (normal + exponential), moments follow cleanly.
For SciPy's parameters \((K,\text{loc},\text{scale})\):
\[ \mathbb{E}[X] = \text{loc} + K\,\text{scale} \]
\[ \mathrm{Var}(X) = \text{scale}^2\,(1+K^2) \]
Skewness and excess kurtosis depend only on \(K\) (location/scale don't change these shape measures):
\[ \gamma_1 = \frac{2K^3}{(1+K^2)^{3/2}}, \qquad \gamma_2 = \frac{6K^4}{(1+K^2)^2}. \]
As \(K\to 0\), \(\gamma_1\to 0\) and you approach a normal; as \(K\to\infty\), \(\gamma_1\to 2\) and you approach an exponential-like skew.
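A quick check that the shape measures depend only on \(K\) and that the skewness saturates at 2 (the loc/scale values below are arbitrary):

# Skewness is invariant to loc/scale and approaches 2 as K grows
for K in [0.1, 1.0, 10.0, 100.0]:
    s1 = float(stats.exponnorm(K).stats(moments="s"))
    s2 = float(stats.exponnorm(K, loc=5.0, scale=3.0).stats(moments="s"))
    print(f"K={K:g}: skew={s1:.4f} (with loc=5, scale=3: {s2:.4f})")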
MGF / characteristic function#
Because \(X\) is a sum of independent variables, MGFs multiply.
For \(X\sim\mathrm{exponnorm}(K,\text{loc},\text{scale})\):
\[ M_X(t) = \mathbb{E}[e^{tX}] = \frac{\exp\!\left(\text{loc}\,t + \tfrac{1}{2}(\text{scale}\,t)^2\right)}{1 - K\,\text{scale}\,t}, \qquad t < \frac{1}{K\,\text{scale}}. \]
\[ \varphi_X(t) = \mathbb{E}[e^{itX}] = \frac{\exp\!\left(i\,\text{loc}\,t - \tfrac{1}{2}(\text{scale}\,t)^2\right)}{1 - iK\,\text{scale}\,t}. \]
Entropy#
The differential entropy \(h(X) = -\mathbb{E}[\log f_X(X)]\) does not have a simple elementary closed form.
In practice you compute it numerically (SciPy provides entropy), and scaling behaves as usual:
\(h(\text{loc}+\text{scale}\,Y) = h(Y) + \log(\text{scale})\).
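The scaling rule is easy to confirm with SciPy's numerical entropy (the parameter values below are arbitrary):

# Entropy scaling check: h(loc + scale*Y) = h(Y) + log(scale); loc drops out
K, loc, scale = 1.2, -0.5, 0.8
h_standard = float(stats.exponnorm(K).entropy())
h_scaled = float(stats.exponnorm(K, loc=loc, scale=scale).entropy())
print(h_scaled, h_standard + np.log(scale))  # should agree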
# Moment checks + MGF/CF + entropy sanity checks
K, loc, scale = 1.2, -0.5, 0.8
rv = stats.exponnorm(K, loc=loc, scale=scale)
# Closed form vs SciPy
closed = exponnorm_stats_closed_form(K, loc=loc, scale=scale)
scipy_stats = tuple(float(v) for v in rv.stats(moments="mvsk"))
print("(mean, var, skew, kurt_excess)")
print("closed:", closed)
print("scipy :", scipy_stats)
# Monte Carlo checks
n = 200_000
samples = rv.rvs(size=n, random_state=rng)
mc = sample_moments(samples)
print("mc :", mc)
# MGF at a safe t
# Domain: t < 1/(K*scale)
t = 0.4
mgf_theory = float(exponnorm_mgf(t, K, loc=loc, scale=scale))
mgf_mc = float(np.mean(np.exp(t * samples)))
# Characteristic function at a frequency
w = 1.0
cf_theory = exponnorm_cf(w, K, loc=loc, scale=scale)
cf_mc = np.mean(np.exp(1j * w * samples))
print("mgf theory vs mc:", mgf_theory, mgf_mc)
print("cf theory vs mc:", cf_theory, cf_mc)
# Entropy: SciPy vs Monte Carlo estimate -E[log f(X)]
h_scipy = float(rv.entropy())
h_mc = float(-np.mean(rv.logpdf(samples)))
print("entropy scipy vs mc:", h_scipy, h_mc)
(mean, var, skew, kurt_excess)
closed: (0.45999999999999996, 1.5616000000000003, 0.9067529857542795, 2.0897608169846813)
scipy : (0.45999999999999996, 1.5616, 0.9067529857542795, 2.0897608169846817)
mc : (0.4597599350100935, 1.5543152074902815, 0.9030605189927701, 2.0771188553706823)
mgf theory vs mc: 1.3989309187572552 1.3974357979472805
cf theory vs mc: (0.5055499322164319+0.1371935417217969j) (0.5057498583668312+0.13764341874813715j)
entropy scipy vs mc: 1.5894958483983916 1.5875920033245619
5) Parameter Interpretation#
Shape parameter (K)#
\(K\) is dimensionless: it's the exponential mean measured in units of the normal standard deviation.
In the \((\mu,\sigma,\lambda)\) parameterization: \(K = 1/(\sigma\lambda) = \tau/\sigma\).
Intuition:
small \(K\): the exponential delay is tiny relative to the normal noise \(\Rightarrow\) nearly symmetric (almost normal)
large \(K\): the delay dominates \(\Rightarrow\) strong right tail and skew
loc and scale#
loc shifts the distribution left/right.
scale stretches the distribution and also scales the exponential delay (since the exponential component is multiplied by scale).
Mean/variance reminders: \(\mathbb{E}[X] = \text{loc}+K\,\text{scale}\), \(\mathrm{Var}(X) = \text{scale}^2(1+K^2)\).
Shape changes#
Below we vary \(K\) (keeping the mean fixed) and vary scale to see how the PDF/CDF morph.
# Shape changes as K varies (keep mean fixed at 0 by setting loc = -K when scale=1)
Ks = [0.1, 0.5, 1.0, 2.0]
rvs = [stats.exponnorm(K, loc=-K, scale=1.0) for K in Ks]
lo = min(float(rv.ppf(0.001)) for rv in rvs)
hi = max(float(rv.ppf(0.999)) for rv in rvs)
x = np.linspace(lo, hi, 700)
fig = make_subplots(rows=1, cols=2, subplot_titles=("PDF", "CDF"))
for K, rv in zip(Ks, rvs):
fig.add_trace(go.Scatter(x=x, y=rv.pdf(x), mode="lines", name=f"K={K:g}"), row=1, col=1)
fig.add_trace(go.Scatter(x=x, y=rv.cdf(x), mode="lines", showlegend=False), row=1, col=2)
fig.add_vline(x=0.0, line_dash="dot", opacity=0.35)
fig.update_layout(title="exponnorm: varying K (mean aligned to 0)")
fig.update_xaxes(title_text="x", row=1, col=1)
fig.update_xaxes(title_text="x", row=1, col=2)
fig.update_yaxes(title_text="density", row=1, col=1)
fig.update_yaxes(title_text="cdf", row=1, col=2)
fig.show()
# Scale changes (keep mean fixed at 0 by setting loc = -scale*K)
K = 1.0
scales = [0.5, 1.0, 2.0]
rvs = [stats.exponnorm(K, loc=-(s * K), scale=s) for s in scales]
lo = min(float(rv.ppf(0.001)) for rv in rvs)
hi = max(float(rv.ppf(0.999)) for rv in rvs)
x = np.linspace(lo, hi, 700)
fig = go.Figure()
for s, rv in zip(scales, rvs):
fig.add_trace(go.Scatter(x=x, y=rv.pdf(x), mode="lines", name=f"scale={s:g}"))
fig.add_vline(x=0.0, line_dash="dot", opacity=0.35)
fig.update_layout(
title="exponnorm: varying scale (mean aligned to 0, K fixed)",
xaxis_title="x",
yaxis_title="density",
)
fig.show()
6) Derivations#
6.1 Expectation#
Using the exGaussian sum representation:
\[ X = N + D, \quad N\sim\mathcal{N}(\mu,\sigma^2), \quad D\sim\mathrm{Exp}(\text{rate}=\lambda), \quad N\perp D. \]
Linearity of expectation gives:
\[ \mathbb{E}[X] = \mathbb{E}[N] + \mathbb{E}[D] = \mu + \frac{1}{\lambda}. \]
Mapping to SciPy, \(\mu=\text{loc}\) and \(1/\lambda=\tau = K\,\text{scale}\), so \(\mathbb{E}[X]=\text{loc}+K\,\text{scale}\).
6.2 Variance#
Independence implies variances add:
\[ \mathrm{Var}(X) = \mathrm{Var}(N) + \mathrm{Var}(D) = \sigma^2 + \frac{1}{\lambda^2}. \]
Under SciPy's parameterization, \(\sigma=\text{scale}\) and \(1/\lambda=K\,\text{scale}\), hence \(\mathrm{Var}(X)=\text{scale}^2(1+K^2)\).
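Both identities can be confirmed by simulating the \((\mu,\sigma,\lambda)\) construction directly (a quick Monte Carlo sketch with arbitrary values):

# Monte Carlo check of E[X] = mu + 1/lambda and Var(X) = sigma^2 + 1/lambda^2
mu, sigma, lam = 1.0, 0.6, 2.0
x_sim = rng.normal(mu, sigma, size=300_000) + rng.exponential(1.0 / lam, size=300_000)
print("mean:", float(x_sim.mean()), "vs", mu + 1.0 / lam)
print("var :", float(x_sim.var()), "vs", sigma**2 + 1.0 / lam**2)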
6.3 Likelihood (for data)#
Given i.i.d. observations \(x_1,\dots,x_n\), the likelihood is
\[ L(K,\text{loc},\text{scale}) = \prod_{i=1}^n f_X(x_i;K,\text{loc},\text{scale}). \]
The log-likelihood is
\[ \ell = \sum_{i=1}^n \log f_X(x_i;K,\text{loc},\text{scale}) = -n\log(\text{scale}) + \sum_{i=1}^n \log f_Y\!\left(\frac{x_i-\text{loc}}{\text{scale}};K\right), \]
where \(f_Y\) is the standardized density.
There is no simple closed-form MLE; numerical optimization (or scipy.stats.exponnorm.fit) is used.
# Likelihood + MLE demo (SciPy fit vs a simple custom optimizer)
# Simulate data from a known parameter set
K_true, loc_true, scale_true = 1.3, -0.5, 0.8
n = 3_000
x = sample_exponnorm_numpy(n, K=K_true, loc=loc_true, scale=scale_true, rng=rng)
# SciPy MLE
K_hat_scipy, loc_hat_scipy, scale_hat_scipy = stats.exponnorm.fit(x)
def loglik(x: np.ndarray, K: float, loc: float, scale: float) -> float:
return float(np.sum(exponnorm_logpdf(x, K, loc=loc, scale=scale)))
def fit_exponnorm_mle(x: np.ndarray) -> tuple[float, float, float]:
x = np.asarray(x, dtype=float)
K0, loc0, scale0 = exponnorm_mom_initial_guess(x)
theta0 = np.array([math.log(K0), loc0, math.log(scale0)], dtype=float)
def nll(theta: np.ndarray) -> float:
logK, loc, logscale = float(theta[0]), float(theta[1]), float(theta[2])
K = float(np.exp(logK))
scale = float(np.exp(logscale))
if not (np.isfinite(K) and np.isfinite(loc) and np.isfinite(scale) and scale > 0):
return np.inf
return -loglik(x, K, loc=loc, scale=scale)
res = minimize(nll, x0=theta0, method="Nelder-Mead")
logK_hat, loc_hat, logscale_hat = (float(v) for v in res.x)
return float(np.exp(logK_hat)), float(loc_hat), float(np.exp(logscale_hat))
K_hat_opt, loc_hat_opt, scale_hat_opt = fit_exponnorm_mle(x)
print("true :", (K_true, loc_true, scale_true))
print("scipy:", (float(K_hat_scipy), float(loc_hat_scipy), float(scale_hat_scipy)))
print("opt :", (K_hat_opt, loc_hat_opt, scale_hat_opt))
print("loglik true :", loglik(x, K_true, loc_true, scale_true))
print("loglik scipy:", loglik(x, float(K_hat_scipy), float(loc_hat_scipy), float(scale_hat_scipy)))
print("loglik opt :", loglik(x, K_hat_opt, loc_hat_opt, scale_hat_opt))
true : (1.3, -0.5, 0.8)
scipy: (1.343323830978064, -0.5385454706669304, 0.8036398274203335)
opt : (1.343429820268832, -0.538601664756589, 0.8036168180048361)
loglik true : -4950.712560390255
loglik scipy: -4949.376344116159
loglik opt : -4949.376344452656
7) Sampling & Simulation (NumPy-only)#
Algorithm (Normal + Exponential)#
Use the defining construction:
Sample \(Z \sim \mathcal{N}(0,1)\).
Sample \(E \sim \mathrm{Exp}(\text{rate}=1/K)\) (mean \(K\)).
Set \(Y = Z + E\).
Return \(X = \text{loc} + \text{scale}\,Y\).
This is efficient and numerically stable because it uses only well-behaved base RNGs.
# Monte Carlo sanity check of the NumPy-only sampler
K, loc, scale = 0.9, -0.3, 1.1
n = 200_000
samples = sample_exponnorm_numpy(n, K=K, loc=loc, scale=scale, rng=rng)
mc = sample_moments(samples)
closed = exponnorm_stats_closed_form(K, loc=loc, scale=scale)
print("(mean, var, skew, kurt_excess)")
print("mc :", mc)
print("closed:", closed)
(mean, var, skew, kurt_excess)
mc : (0.6940026389829959, 2.190001974195752, 0.5937727858392163, 1.170714839086485)
closed: (0.6900000000000002, 2.1901000000000006, 0.5987419144908114, 1.201611672415372)
8) Visualization#
We’ll visualize:
the PDF and CDF
Monte Carlo samples (histogram)
empirical CDF vs the theoretical CDF
# PDF/CDF and Monte Carlo comparison
K, loc, scale = 0.9, -0.3, 1.1
rv = stats.exponnorm(K, loc=loc, scale=scale)
n = 80_000
samples = sample_exponnorm_numpy(n, K=K, loc=loc, scale=scale, rng=rng)
x_grid = x_grid_for_plotting(rv, q_low=0.001, q_high=0.999, n=600)
# Histogram + PDF
fig = px.histogram(
samples,
nbins=80,
histnorm="probability density",
title=f"Monte Carlo samples vs PDF (n={n:,}, K={K:g}, loc={loc:g}, scale={scale:g})",
labels={"value": "x"},
)
fig.add_trace(go.Scatter(x=x_grid, y=rv.pdf(x_grid), mode="lines", name="SciPy PDF"))
fig.update_layout(yaxis_title="density")
fig.show()
# Empirical CDF vs theoretical CDF
xs, ys = empirical_cdf(samples)
x_grid = np.linspace(float(np.quantile(xs, 0.001)), float(np.quantile(xs, 0.999)), 600)
fig = go.Figure()
fig.add_trace(go.Scatter(x=xs, y=ys, mode="lines", name="empirical CDF"))
fig.add_trace(go.Scatter(x=x_grid, y=rv.cdf(x_grid), mode="lines", name="theoretical CDF"))
fig.update_layout(
title=f"Empirical CDF vs theoretical CDF (n={n:,})",
xaxis_title="x",
yaxis_title="F(x)",
)
fig.show()
9) SciPy Integration#
SciPy implements this distribution as scipy.stats.exponnorm.
Useful methods:
pdf, logpdf
cdf, logcdf
sf, logsf (often more accurate in the tail)
rvs for sampling
fit for MLE parameter estimation
About fit
exponnorm.fit(data) estimates \((K,\text{loc},\text{scale})\) via maximum likelihood.
As \(K\to 0\), the distribution approaches a normal and the MLE can become numerically delicate (boundary effects).
For more control, you can fix parameters (e.g., floc=...) or fit via a custom optimizer on logpdf, as sketched below.
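For example, fixing the location during fitting (a minimal sketch; the simulated parameters are arbitrary):

# Fit with a fixed location: floc pins loc, so only K and scale are estimated
data = stats.exponnorm(1.1, loc=0.0, scale=0.9).rvs(size=5_000, random_state=rng)
print("free  :", stats.exponnorm.fit(data))
print("floc=0:", stats.exponnorm.fit(data, floc=0.0))  # returned loc is exactly 0.0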
# Basic SciPy usage + fit demo
K, loc, scale = 1.1, -0.4, 0.9
rv = stats.exponnorm(K, loc=loc, scale=scale)
xs = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
print("pdf:", rv.pdf(xs))
print("cdf:", rv.cdf(xs))
# Sampling via SciPy
samples_scipy = rv.rvs(size=5_000, random_state=rng)
# Fit parameters back
K_hat, loc_hat, scale_hat = stats.exponnorm.fit(samples_scipy)
print("fit (K, loc, scale):", (float(K_hat), float(loc_hat), float(scale_hat)))
pdf: [0.027719 0.161068 0.327328 0.275103 0.129876]
cdf: [0.010279 0.093035 0.347585 0.667741 0.867593]
fit (K, loc, scale): (1.1180461380307114, -0.4026713345023387, 0.8899021130108721)
10) Statistical Use Cases#
10.1 Hypothesis testing / anomaly detection#
If you have a baseline model \(X\sim\mathrm{exponnorm}(K,\text{loc},\text{scale})\), then an unusually large observation \(x_\text{obs}\) can be scored by a right-tail probability:
\[ \text{p-value} = \Pr(X \ge x_\text{obs}) = \mathrm{sf}(x_\text{obs}). \]
For model comparison (e.g., "is a normal enough?"), likelihood-ratio statistics are natural, but because \(K\ge 0\) and the normal corresponds to the boundary case \(K=0\), a parametric bootstrap is a safer way to calibrate the test than relying on a \(\chi^2\) reference.
10.2 Bayesian modeling#
A Bayesian model uses the same latent decomposition:
\[ x_i = \mu + \sigma z_i + d_i, \quad z_i\sim\mathcal{N}(0,1), \quad d_i\sim\mathrm{Exp}(\lambda). \]
Choose priors like:
\(\mu\sim\mathcal{N}(m_0,s_0^2)\)
\(\sigma\sim\mathrm{HalfNormal}(s)\)
\(\lambda\sim\mathrm{Gamma}(\alpha,\beta)\)
This representation is convenient in probabilistic programming systems (Stan, PyMC, Turing).
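To make the pattern concrete without bringing in a PP library, here is a minimal random-walk Metropolis sketch for \((\mu, \log\sigma, \log\lambda)\) under priors of the kind listed above. The hyperparameters, step size, and chain length are illustrative assumptions; this is a didactic sketch, not a tuned sampler:

# Random-walk Metropolis on (mu, log sigma, log lambda) -- illustrative only
def log_posterior(theta: np.ndarray, data: np.ndarray) -> float:
    mu, log_sigma, log_lam = (float(v) for v in theta)
    sigma, lam = math.exp(log_sigma), math.exp(log_lam)
    # Likelihood via the SciPy mapping K = 1/(sigma*lambda)
    ll = float(np.sum(stats.exponnorm.logpdf(data, 1.0 / (sigma * lam), loc=mu, scale=sigma)))
    # Assumed priors: mu ~ N(0, 10^2), sigma ~ HalfNormal(5), lambda ~ Gamma(2, 1)
    lp = stats.norm.logpdf(mu, loc=0.0, scale=10.0)
    lp += stats.halfnorm.logpdf(sigma, scale=5.0) + log_sigma  # + log-Jacobian of log transform
    lp += stats.gamma.logpdf(lam, a=2.0, scale=1.0) + log_lam  # + log-Jacobian
    return ll + lp

data = sample_exponnorm_numpy(500, K=1.0, loc=0.0, scale=1.0, rng=rng)
theta = np.array([float(np.mean(data)), 0.0, 0.0])
lp_cur = log_posterior(theta, data)
draws = []
for _ in range(4_000):
    prop = theta + 0.05 * rng.normal(size=3)      # fixed step size (assumption)
    lp_prop = log_posterior(prop, data)
    if np.log(rng.uniform()) < lp_prop - lp_cur:  # Metropolis accept/reject
        theta, lp_cur = prop, lp_prop
    draws.append(theta)
post = np.array(draws)[2_000:]                    # drop burn-in
print("posterior means (mu, sigma, lambda):",
      post[:, 0].mean(), np.exp(post[:, 1]).mean(), np.exp(post[:, 2]).mean())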
10.3 Generative modeling#
exponnorm is a useful component distribution for skewed data:
mixture models (mixture of exGaussians)
emission models in HMMs for duration-like observations
synthetic data generation for latency / response-time benchmarks
# 10.1 Tail probability (anomaly scoring) + bootstrap LRT vs Normal
# Tail probability under a baseline model
K0, loc0, scale0 = 0.9, -0.3, 1.1
rv0 = stats.exponnorm(K0, loc=loc0, scale=scale0)
x_obs = float(rv0.ppf(0.995))
p_value = float(rv0.sf(x_obs)) # P(X >= x_obs)
print("x_obs:", x_obs)
print("p_value (right tail):", p_value)
# Model comparison: ExponNorm vs Normal via parametric bootstrap
n = 400
x = rv0.rvs(size=n, random_state=rng)
# Fit Normal (2 params)
mu_hat, sigma_hat = stats.norm.fit(x)
ll_norm = float(np.sum(stats.norm.logpdf(x, loc=mu_hat, scale=sigma_hat)))
# Fit ExponNorm (3 params)
K_hat, loc_hat, scale_hat = stats.exponnorm.fit(x)
ll_expn = float(np.sum(stats.exponnorm.logpdf(x, K_hat, loc=loc_hat, scale=scale_hat)))
T_obs = 2.0 * (ll_expn - ll_norm)
B = 30 # increase for a more stable p-value
T_boot = []
for _ in range(B):
xb = stats.norm.rvs(loc=mu_hat, scale=sigma_hat, size=n, random_state=rng)
mu_b, sigma_b = stats.norm.fit(xb)
ll_n_b = float(np.sum(stats.norm.logpdf(xb, loc=mu_b, scale=sigma_b)))
K_b, loc_b, scale_b = stats.exponnorm.fit(xb)
ll_e_b = float(np.sum(stats.exponnorm.logpdf(xb, K_b, loc=loc_b, scale=scale_b)))
T_boot.append(2.0 * (ll_e_b - ll_n_b))
p_boot = float(np.mean(np.asarray(T_boot) >= T_obs))
print("T_obs:", T_obs)
print("bootstrap p-value (Normal is enough?):", p_boot)
# 10.3 Simple mixture of two exponnorm components (generative modeling)
w = 0.6
n = 60_000
n1 = int(w * n)
n2 = n - n1
x1 = sample_exponnorm_numpy(n1, K=0.4, loc=-1.0, scale=0.7, rng=rng)
x2 = sample_exponnorm_numpy(n2, K=1.5, loc=1.0, scale=0.9, rng=rng)
x_mix = np.concatenate([x1, x2])
rng.shuffle(x_mix)
fig = px.histogram(
x_mix,
nbins=90,
histnorm="probability density",
title="Mixture of two `exponnorm` components",
labels={"value": "x"},
)
fig.show()
x_obs: 5.556442873253898
p_value (right tail): 0.004999999999999975
T_obs: 16.563440711130625
bootstrap p-value (Normal is enough?): 0.0
11) Pitfalls#
Invalid parameters: SciPy requires K > 0 and scale > 0.
Parameterization confusion: many references use \((\mu,\sigma,\lambda)\); SciPy uses \(K=1/(\sigma\lambda)\). Track whether you mean the rate \(\lambda\) or the mean delay \(\tau=1/\lambda\).
Right-skew only: exponnorm produces positive skew. If your data is left-skewed, consider reflecting the data or using a different family.
Fitting near \(K\approx 0\): the normal is a boundary case; optimization can be sensitive and standard LRT asymptotics may fail.
Tail numerics: for extreme \(x\), 1 - cdf(x) can suffer catastrophic cancellation. Prefer sf/logsf, and prefer logpdf over pdf in the far tail.
# Numerical pitfall demo: 1 - cdf vs sf in the far right tail
K, loc, scale = 1.2, 0.0, 1.0
rv = stats.exponnorm(K, loc=loc, scale=scale)
x = 50.0
sf_naive = 1.0 - rv.cdf(x)
sf_stable = rv.sf(x)
logsf = rv.logsf(x)
{"1-cdf": sf_naive, "sf": float(sf_stable), "logsf": float(logsf)}
{'1-cdf': 0.0, 'sf': 1.1355160639014264e-18, 'logsf': -41.31944444444445}
12) Summary#
exponnorm is a continuous distribution on \(\mathbb{R}\) modeling Normal + Exponential (noise + delay).
SciPy parameters: \(K>0\), loc, scale; mean \(= \text{loc} + K\,\text{scale}\), variance \(= \text{scale}^2(1+K^2)\).
Shape depends only on \(K\): skewness \(\gamma_1 = 2K^3/(1+K^2)^{3/2}\), excess kurtosis \(\gamma_2 = 6K^4/(1+K^2)^2\).
Sampling is simple with NumPy: normal + exponential, then an affine transform.
In practice, prefer logpdf, logcdf, and sf/logsf for tail computations, and treat fit near \(K\approx 0\) with care.
References#
SciPy docstring: scipy.stats.exponnorm
Exponentially modified Gaussian distribution (Wikipedia)